Results 1 - 20 of 39
1.
J Med Chem ; 67(2): 1544-1562, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38175811

ABSTRACT

NLRP3 is a molecular sensor recognizing a wide range of danger signals. Its activation leads to the assembly of an inflammasome that allows for activation of caspase-1 and subsequent maturation of IL-1β and IL-18, as well as cleavage of gasdermin D and pyroptotic cell death. The NLRP3 inflammasome has been implicated in a plethora of diseases including gout, type 2 diabetes, atherosclerosis, Alzheimer's disease, and cancer. In this publication, we describe the discovery of a novel, tricyclic, NLRP3-binding scaffold by high-throughput screening. The hit (1) could be optimized into an advanced compound NP3-562 demonstrating excellent potency in human whole blood and full inhibition of IL-1β release in a mouse acute peritonitis model at 30 mg/kg po dose. An X-ray structure of NP3-562 bound to the NLRP3 NACHT domain revealed a unique binding mode as compared to the known sulfonylurea-based inhibitors. In addition, NP3-562 shows a good overall development profile.


Subjects
Diabetes Mellitus, Type 2 , Gout , Mice , Animals , Humans , NLR Family, Pyrin Domain-Containing 3 Protein/metabolism , Inflammasomes/metabolism , Diabetes Mellitus, Type 2/metabolism , Macrophages/metabolism , Interleukin-1beta/metabolism , Caspase 1/metabolism
2.
J Cheminform ; 15(1): 119, 2023 Dec 11.
Article in English | MEDLINE | ID: mdl-38082357

ABSTRACT

Time-split cross-validation is broadly recognized as the gold standard for validating predictive models intended for use in medicinal chemistry projects. Unfortunately this type of data is not broadly available outside of large pharmaceutical research organizations. Here we introduce the SIMPD (simulated medicinal chemistry project data) algorithm to split public data sets into training and test sets that mimic the differences observed in real-world medicinal chemistry project data sets. SIMPD uses a multi-objective genetic algorithm with objectives derived from an extensive analysis of the differences between early and late compounds in more than 130 lead-optimization projects run within the Novartis Institutes for BioMedical Research. Applying SIMPD to the real-world data sets produced training/test splits which more accurately reflect the differences in properties and machine-learning performance observed for temporal splits than other standard approaches like random or neighbor splits. We applied the SIMPD algorithm to bioactivity data extracted from ChEMBL and created 99 public data sets which can be used for validating machine-learning models intended for use in the setting of a medicinal chemistry project. The SIMPD code and simulated data sets are available under open-source/open-data licenses at github.com/rinikerlab/molecular_time_series.
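
To make the notion of a time split concrete, here is a minimal sketch, assuming each compound record carries a registration date. The field names, the example records, and the 80/20 cut are hypothetical. This is a plain temporal split of the kind SIMPD aims to imitate on public data; it is not the SIMPD algorithm itself, which the abstract describes as a multi-objective genetic algorithm.

```python
from datetime import date


def time_split(records, train_fraction=0.8):
    """Sort records by date and put the earliest fraction in the training set."""
    ordered = sorted(records, key=lambda r: r["date"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Hypothetical project data: compound id, registration date, measured pIC50
records = [
    {"id": "cpd-3", "date": date(2021, 6, 1), "pic50": 6.9},
    {"id": "cpd-1", "date": date(2020, 1, 15), "pic50": 5.2},
    {"id": "cpd-5", "date": date(2022, 3, 9), "pic50": 7.8},
    {"id": "cpd-2", "date": date(2020, 9, 30), "pic50": 6.1},
    {"id": "cpd-4", "date": date(2021, 11, 20), "pic50": 7.4},
]

train, test = time_split(records)
train_ids = [r["id"] for r in train]
test_ids = [r["id"] for r in test]
```

Every test compound is registered after every training compound, which is the property a random split does not guarantee.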

3.
Nat Commun ; 14(1): 6651, 2023 10 31.
Article in English | MEDLINE | ID: mdl-37907461

ABSTRACT

The lead optimization process in drug discovery campaigns is an arduous endeavour where the input of many medicinal chemists is weighed in order to reach a desired molecular property profile. Building the expertise to successfully drive such projects collaboratively is a very time-consuming process that typically spans many years within a chemist's career. In this work we aim to replicate this process by applying artificial intelligence learning-to-rank techniques on feedback that was obtained from 35 chemists at Novartis over the course of several months. We exemplify the usefulness of the learned proxies in routine tasks such as compound prioritization, motif rationalization, and biased de novo drug design. Annotated response data is provided, and developed models and code made available through a permissive open-source license.
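
As an illustration of learning to rank from pairwise feedback, the following is a minimal sketch using a linear scoring function trained with a logistic pairwise loss. The descriptors, the example pairs, and the hyperparameters are hypothetical, and this is not the model published with the paper; it only shows how "chemist preferred A over B" answers can be turned into a scoring function.

```python
import math


def train_pairwise_ranker(pairs, n_features, epochs=200, lr=0.1):
    """Fit linear weights w so that score(preferred) > score(other).

    Minimizes the logistic pairwise loss
    -log(sigmoid(w . (x_preferred - x_other))) by gradient descent.
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        for preferred, other in pairs:
            diff = [a - b for a, b in zip(preferred, other)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            # Negative gradient of the loss is (1 - sigmoid(margin)) * diff
            step = 1.0 - 1.0 / (1.0 + math.exp(-margin))
            w = [wi + lr * step * di for wi, di in zip(w, diff)]
    return w


def score(w, x):
    """Linear score of one descriptor vector."""
    return sum(wi * xi for wi, xi in zip(w, x))


# Hypothetical descriptors: [scaled lipophilicity, scaled polar surface area]
# Each pair is (compound the chemist preferred, compound not chosen).
pairs = [
    ([0.2, 0.6], [0.9, 0.5]),
    ([0.3, 0.4], [0.8, 0.4]),
    ([0.1, 0.7], [0.7, 0.6]),
]
w = train_pairwise_ranker(pairs, n_features=2)

candidates = {"A": [0.15, 0.5], "B": [0.85, 0.5]}
ranked = sorted(candidates, key=lambda k: score(w, candidates[k]), reverse=True)
```

Because every preferred compound in this toy data has the lower first descriptor, the learned weight on that descriptor is negative and candidate A ranks ahead of B.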


Subjects
Artificial Intelligence , Chemistry, Pharmaceutical , Chemistry, Pharmaceutical/methods , Intuition , Drug Discovery/methods , Drug Design , Machine Learning
4.
J Med Chem ; 66(20): 14047-14060, 2023 10 26.
Article in English | MEDLINE | ID: mdl-37815201

ABSTRACT

Early in silico assessment of the potential of a series of compounds to deliver a drug is one of the major challenges in computer-assisted drug design. The goal is to identify the right chemical series of compounds out of a large chemical space to then subsequently prioritize the molecules with the highest potential to become a drug. Although multiple approaches to assess compounds have been developed over decades, the quality of these predictors is often not good enough and compounds that agree with the respective estimates are not necessarily druglike. Here, we report a novel deep learning approach that leverages large-scale predictions of ∼100 ADMET assays to assess the potential of a compound to become a relevant drug candidate. The resulting score, which we termed bPK score, substantially outperforms previous approaches and showed strong discriminative performance on data sets where previous approaches did not.


Subjects
Computer Simulation
5.
J Chem Inf Model ; 63(15): 4497-4504, 2023 08 14.
Article in English | MEDLINE | ID: mdl-37487018

ABSTRACT

Machine-learning and deep-learning models have been extensively used in cheminformatics to predict molecular properties, to reduce the need for direct measurements, and to accelerate compound prioritization. However, different setups and frameworks and the large number of molecular representations make it difficult to properly evaluate, reproduce, and compare them. Here we present a new PREdictive modeling FramEwoRk for molecular discovery (PREFER), written in Python (version 3.7.7) and based on AutoSklearn (version 0.14.7), that allows comparison between different molecular representations and common machine-learning models. We provide an overview of the design of our framework and show exemplary use cases and results of several representation-model combinations on diverse data sets, both public and in-house. Finally, we discuss the use of PREFER on small data sets. The code of the framework is freely available on GitHub.


Subjects
Cheminformatics , Machine Learning
6.
ACS Omega ; 8(2): 2046-2056, 2023 Jan 17.
Article in English | MEDLINE | ID: mdl-36687099

ABSTRACT

Lipophilicity, as measured by the partition coefficient between octanol and water (log P), is a key parameter in early drug discovery research. However, measuring log P experimentally is difficult for specific compounds and log P ranges. The resulting lack of reliable experimental data impedes development of accurate in silico models for such compounds. In certain discovery projects at Novartis focused on such compounds, a quantum mechanics (QM)-based tool for log P estimation has emerged as a valuable supplement to experimental measurements and as a preferred alternative to existing empirical models. However, this QM-based approach incurs a substantial computational cost, limiting its applicability to small series and prohibiting quick, interactive ideation. This work explores a set of machine learning models (Random Forest, Lasso, XGBoost, Chemprop, and Chemprop3D) to learn calculated log P values on both a public data set and an in-house data set to obtain a computationally affordable, QM-based estimation of drug lipophilicity. The message-passing neural network model Chemprop emerged as the best performing model with mean absolute errors of 0.44 and 0.34 log units for scaffold split test sets of the public and in-house data sets, respectively. Analysis of learning curves suggests that a further decrease in the test set error can be achieved by increasing the training set size. While models directly trained on experimental data perform better at approximating experimentally determined log P values than models trained on calculated values, we discuss the potential advantages of using calculated log P values going beyond the limits of experimental quantitation. We analyze the impact of the data set splitting strategy and gain insights into model failure modes. Potential use cases for the presented models include pre-screening of large compound collections and prioritization of compounds for full QM calculations.

7.
J Chem Inf Model ; 62(23): 6002-6021, 2022 Dec 12.
Article in English | MEDLINE | ID: mdl-36351293

ABSTRACT

In the drug development process, optimization of properties and biological activities of small molecules is an important task to obtain drug candidates with optimal efficacy when first applied in subsequent clinical studies. However, despite its importance, large-scale investigations of the optimization process in early drug discovery are lacking, likely due to the absence of historical records of different chemical series used in past projects. Here, we report a retrospective reconstruction of ∼3000 chemical series from the Novartis compound database, which allows us to characterize the general properties of chemical series as well as the time evolution of structural properties, ADMET properties, and target activities. Our data-driven approach allows us to substantiate common MedChem knowledge. We find that size, fraction of sp3-hybridized carbon atoms (Fsp3), and the density of stereocenters tend to increase during optimization, while the aromaticity of the compounds decreases. On the ADMET side, solubility tends to increase and permeability decreases, while safety-related properties tend to improve. Importantly, while ligand efficiency decreases due to molecular growth over time, target activities and lipophilic efficiency tend to improve. This emphasizes the heavy-atom count and log D as important parameters to monitor, especially as we further show that the decrease in permeability can be explained with the increase in molecular size. We highlight overlaps, shortcomings, and differences of the computationally reconstructed chemical series compared to the series used in recent internal drug discovery projects and investigate the relation to historical projects.
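
The opposing trends in ligand efficiency and lipophilic efficiency can be illustrated with their commonly used definitions. This is a minimal sketch, assuming LE = 1.37 × pIC50 / heavy-atom count (kcal/mol per heavy atom, with 1.37 approximating 2.303·RT near 300 K) and LipE = pIC50 − log D. The abstract does not state which exact definitions the authors used, and the two compounds are hypothetical.

```python
def ligand_efficiency(pic50, heavy_atoms):
    """LE in kcal/mol per heavy atom; 1.37 approximates 2.303*R*T near 300 K."""
    return 1.37 * pic50 / heavy_atoms


def lipophilic_efficiency(pic50, log_d):
    """LipE = pIC50 - log D."""
    return pic50 - log_d


# Hypothetical early hit and later optimized compound from the same series
early = {"pic50": 6.0, "heavy_atoms": 20, "log_d": 3.5}
late = {"pic50": 8.5, "heavy_atoms": 34, "log_d": 2.8}

le_early = ligand_efficiency(early["pic50"], early["heavy_atoms"])
le_late = ligand_efficiency(late["pic50"], late["heavy_atoms"])
lipe_early = lipophilic_efficiency(early["pic50"], early["log_d"])
lipe_late = lipophilic_efficiency(late["pic50"], late["log_d"])
```

In this toy case potency rises by 2.5 log units, yet LE falls because the molecule gains 14 heavy atoms, while LipE rises because log D drops; this is the pattern the abstract reports across series.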


Subjects
Drug Discovery , Retrospective Studies , Ligands , Solubility , Databases, Factual
8.
Bioorg Med Chem Lett ; 64: 128667, 2022 05 15.
Article in English | MEDLINE | ID: mdl-35276359

ABSTRACT

Inhibition of mutant activin A type-1 receptor ACVR1 (ALK2) signaling by small-molecule drugs is a promising therapeutic approach to treat fibrodysplasia ossificans progressiva (FOP), an ultra-rare disease leading to progressive soft tissue heterotopic ossification with no curative treatment available to date. Here, we describe the synthesis and in vitro characterization of a novel series of 2-aminopyrazine-3-carboxamides that led to the discovery of Compound 23 showing excellent biochemical and cellular potency, selectivity over other BMP and TGFβ signaling receptor kinases, and a favorable in vitro ADME profile.


Subjects
Myositis Ossificans , Ossification, Heterotopic , Activin Receptors, Type I , Humans , Myositis Ossificans/drug therapy , Pyrazines/pharmacology , Pyrazines/therapeutic use , Signal Transduction
9.
Mol Inform ; 41(6): e2100277, 2022 06.
Article in English | MEDLINE | ID: mdl-34964302

ABSTRACT

The ability to predict chemical reactivity of a molecule is highly desirable in drug discovery, both ex vivo (synthetic route planning, formulation, stability) and in vivo: metabolic reactions determine pharmacodynamics, pharmacokinetics and potential toxic effects, and early assessment of liabilities is vital to reduce attrition rates in later stages of development. Quantum mechanics offers a precise description of the interactions between electrons and orbitals in the breaking and forming of new bonds. Modern algorithms and faster computers have allowed the study of more complex systems in a punctual and accurate fashion, and answers for chemical questions around stability and reactivity can now be provided. Through machine learning, predictive models can be built out of descriptors derived from quantum mechanics and cheminformatics, even in the absence of experimental data to train on. In this article, current progress on computational reactivity prediction is reviewed: applications to problems in drug design, such as modelling of metabolism and covalent inhibition, are highlighted and unmet challenges are posed.


Subjects
Cheminformatics , Machine Learning , Algorithms , Drug Design , Drug Discovery/methods
10.
J Mol Biol ; 433(24): 167309, 2021 12 03.
Article in English | MEDLINE | ID: mdl-34687713

ABSTRACT

The NLRP3 inflammasome assembles in response to a variety of pathogenic and sterile danger signals, resulting in the production of interleukin-1β and interleukin-18. NLRP3 is a key component of the innate immune system and has been implicated as a driver of a number of acute and chronic diseases. We report the 2.8 Å crystal structure of the NLRP3 NACHT domain in complex with an inhibitor. The structure defines a binding pocket formed by the four subdomains of the NACHT domain, and shows the inhibitor acts as an intramolecular glue, which locks the protein in an inactive conformation. It provides further molecular insight into our understanding of NLRP3 activation, helps to detail the residues involved in subdomain coordination within the NLRP3 NACHT domain, and gives molecular insights into how gain-of-function mutations de-stabilize the inactive conformation of NLRP3. Finally, it suggests stabilizing the auto-inhibited form of the NACHT domain is an effective way to inhibit NLRP3, and will aid the structure-based development of NLRP3 inhibitors for a range of inflammatory diseases.


Subjects
Inflammasomes/antagonists & inhibitors , NLR Family, Pyrin Domain-Containing 3 Protein/antagonists & inhibitors , NLR Family, Pyrin Domain-Containing 3 Protein/chemistry , Binding Sites , Catalytic Domain , Crystallography, X-Ray , Furans/chemistry , Furans/pharmacology , Humans , Indenes/chemistry , Indenes/pharmacology , Inflammasomes/metabolism , Protein Domains , Sulfonamides/chemistry , Sulfonamides/pharmacology
11.
J Chem Inf Model ; 61(6): 2623-2640, 2021 06 28.
Article in English | MEDLINE | ID: mdl-34100609

ABSTRACT

Machine learning classifiers trained on class imbalanced data are prone to overpredict the majority class. This leads to a larger misclassification rate for the minority class, which in many real-world applications is the class of interest. For binary data, the classification threshold is set by default to 0.5 which, however, is often not ideal for imbalanced data. Adjusting the decision threshold is a good strategy to deal with the class imbalance problem. In this work, we present two different automated procedures for the selection of the optimal decision threshold for imbalanced classification. A major advantage of our procedures is that they do not require retraining of the machine learning models or resampling of the training data. The first approach is specific for random forest (RF), while the second approach, named GHOST, can be potentially applied to any machine learning classifier. We tested these procedures on 138 public drug discovery data sets containing structure-activity data for a variety of pharmaceutical targets. We show that both thresholding methods improve significantly the performance of RF. We tested the use of GHOST with four different classifiers in combination with two molecular descriptors, and we found that most classifiers benefit from threshold optimization. GHOST also outperformed other strategies, including random undersampling and conformal prediction. Finally, we show that our thresholding procedures can be effectively applied to real-world drug discovery projects, where the imbalance and characteristics of the data vary greatly between the training and test sets.
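
The idea of moving the decision threshold without retraining can be sketched as a simple scan over candidate thresholds. This is a generic illustration, not the GHOST procedure or the RF-specific method from the paper; the use of Cohen's kappa as the objective, the candidate thresholds, and the predicted probabilities are all assumptions of this sketch.

```python
def cohen_kappa(y_true, y_pred):
    """Cohen's kappa for binary labels (1 = active, 0 = inactive)."""
    n = len(y_true)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = n - tp - tn - fp
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    if expected == 1.0:
        return 0.0
    return (observed - expected) / (1.0 - expected)


def labels_at(probs, threshold):
    """Turn predicted probabilities into class labels at a given threshold."""
    return [1 if p >= threshold else 0 for p in probs]


def best_threshold(y_true, probs, thresholds):
    """Return the candidate threshold with the highest kappa."""
    return max(thresholds, key=lambda th: cohen_kappa(y_true, labels_at(probs, th)))


# Hypothetical predictions on imbalanced data: 2 actives, 8 inactives
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
probs = [0.45, 0.30, 0.20, 0.10, 0.05, 0.15, 0.08, 0.12, 0.25, 0.02]
thresholds = [0.1, 0.2, 0.3, 0.4, 0.5]

chosen = best_threshold(y_true, probs, thresholds)
kappa_default = cohen_kappa(y_true, labels_at(probs, 0.5))
kappa_chosen = cohen_kappa(y_true, labels_at(probs, chosen))
```

With the default threshold of 0.5 every compound in this toy set is predicted inactive, so kappa is zero; lowering the threshold recovers the minority class. In practice the threshold should be chosen on training or validation predictions, never on the test set.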


Subjects
Algorithms , Machine Learning
12.
J Med Chem ; 63(23): 14425-14447, 2020 12 10.
Article in English | MEDLINE | ID: mdl-33140646

ABSTRACT

This article summarizes the evolution of the screening deck at the Novartis Institutes for BioMedical Research (NIBR). Historically, the screening deck was an assembly of all available compounds. In 2015, we designed a first deck to facilitate access to diverse subsets with optimized properties. We allocated the compounds as plated subsets on a 2D grid with property based ranking in one dimension and increasing structural redundancy in the other. The learnings from the 2015 screening deck were applied to the design of a next generation in 2019. We found that using traditional leadlikeness criteria (mainly MW, clogP) reduces the hit rates of attractive chemical starting points in subset screening. Consequently, the 2019 deck relies on solubility and permeability to select preferred compounds. The 2019 design also uses NIBR's experimental assay data and inferred biological activity profiles in addition to structural diversity to define redundancy across the compound sets.


Subjects
Small Molecule Libraries/chemistry , Drug Design , Drug Evaluation, Preclinical/methods , High-Throughput Screening Assays/methods , Small Molecule Libraries/pharmacology
14.
J Chem Inf Model ; 60(7): 3331-3335, 2020 07 27.
Article in English | MEDLINE | ID: mdl-32584031

ABSTRACT

We present an implementation of the scaffold network in the open source cheminformatics toolkit RDKit. Scaffold networks have been introduced in the literature as a powerful method to navigate and analyze large screening data sets in medicinal chemistry. Such a network can be created by iteratively applying predefined fragmentation rules to the investigated set of small molecules and by linking the produced fragments according to their descendence. This procedure results in a network graph, where the nodes correspond to the fragments and the edges correspond to the operations producing one fragment from another. In extension to the scaffold network implementations suggested in the literature, the presented implementation in RDKit allows an enhanced flexibility in terms of customizing the fragmentation rules and enables the inclusion of atom- and bond-generic scaffolds into the network. The output, providing node and edge information on the network, enables a simple and elegant navigation through the network, laying the basis to organize and better understand the data set being investigated.


Subjects
Cheminformatics , Software , Chemistry, Pharmaceutical
15.
J Chem Inf Model ; 60(6): 2888-2902, 2020 06 22.
Article in English | MEDLINE | ID: mdl-32374165

ABSTRACT

We investigate different automated approaches for the classification of chemical series in early drug discovery, with the aim of closely mimicking human chemical series conception. Chemical series, which are commonly defined by hand-drawn scaffolds, organize datasets in drug discovery projects. Often, they form the basis for further project decisions. To trace and evaluate these decisions in historic and ongoing projects, it is important to know or reconstruct chemical series. There is not a unique correct definition of chemical series, and the human definition certainly involves a subjective bias. Hence, we first develop quality metrics for the chemical series definitions, evaluating the size and specificity of chemical series. These metrics are applied to categorize human series definitions and implemented in automated classification approaches. For the automated classification of chemical series, we test different fragmentation and similarity-based clustering algorithms and apply different approaches to infer series definitions from these clusters or sets of fragments. We benchmark the classification results against human-defined series from 30 internal projects. The best results in reproducing the composition of human-defined series are achieved when applying UPGMA (unweighted pair group method with arithmetic mean) clustering to the project dataset and calculating maximum common substructures of the clusters as series definitions. We evaluate this approach in more detail on a public dataset and assess its robustness by 10-fold cross-validation, each time sampling 40% of the dataset. Through these benchmarking and validation experiments, we show that the proposed automated approach is able to accurately and robustly identify human-defined series, which comply with a certain, predefined level of specificity and size. 
Suggesting a thoroughly tested algorithm for series classification, as well as quality metrics for series and several benchmarking approaches, this work lays the foundation for further analysis of project decisions, and it offers an enhanced understanding of the properties of human-defined chemical series.
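
The clustering step named in the abstract can be sketched in plain Python. This is a minimal illustration of UPGMA (average-linkage) clustering on Tanimoto distances, assuming fingerprints are given as sets of on-bit indices. The fingerprints and the distance cutoff are hypothetical, and the follow-up step from the paper, computing a maximum common substructure per cluster, is omitted because it needs a cheminformatics toolkit.

```python
def tanimoto_distance(a, b):
    """1 - |a & b| / |a | b| for two fingerprints given as sets of on-bits."""
    return 1.0 - len(a & b) / len(a | b)


def upgma(fingerprints, cutoff):
    """Average-linkage agglomerative clustering.

    Repeatedly merges the two clusters with the smallest mean pairwise
    distance and stops once that distance exceeds `cutoff`.
    Returns clusters as sorted lists of fingerprint indices.
    """
    clusters = [[i] for i in range(len(fingerprints))]

    def avg_dist(c1, c2):
        total = sum(
            tanimoto_distance(fingerprints[i], fingerprints[j])
            for i in c1
            for j in c2
        )
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        dist, i, j = min(
            (
                (avg_dist(clusters[i], clusters[j]), i, j)
                for i in range(len(clusters))
                for j in range(i + 1, len(clusters))
            ),
            key=lambda t: t[0],
        )
        if dist > cutoff:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Hypothetical fingerprints: compounds 0-2 share bits, as do compounds 3-4
fingerprints = [
    {1, 2, 3, 4},
    {1, 2, 3, 5},
    {1, 2, 4, 5},
    {10, 11, 12},
    {10, 11, 13},
]
series = upgma(fingerprints, cutoff=0.6)
```

Each resulting cluster is a candidate chemical series; the cutoff controls how specific a series definition is allowed to be.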


Subjects
Algorithms , Benchmarking , Cluster Analysis , Humans
16.
J Med Chem ; 63(16): 8824-8834, 2020 08 27.
Article in English | MEDLINE | ID: mdl-32101427

ABSTRACT

Artificial intelligence (AI) is becoming established in drug discovery. For example, many in the industry are applying machine learning approaches to target discovery or to optimize compound synthesis. While our organization is certainly applying these sorts of approaches, we propose an additional approach: using AI to augment human intelligence. We have been working on a series of recommendation systems that take advantage of our existing laboratory processes, both wet and computational, in order to provide inspiration to our chemists, suggest next steps in their work, and automate existing workflows. We will describe five such systems in various stages of deployment within the Novartis Institutes for BioMedical Research. While each of these systems addresses different stages of the discovery pipeline, all of them share three common features: a trigger that initiates the recommendation, an analysis that leverages our existing systems with AI, and the delivery of a recommendation. The goal of all of these systems is to inspire and accelerate the drug discovery process.


Subjects
Artificial Intelligence , Chemistry, Pharmaceutical/methods , Drug Discovery/methods , Pharmaceutical Research/methods , Chemistry, Pharmaceutical/organization & administration , Databases, Chemical , Electronic Mail , Humans , Pharmaceutical Research/organization & administration , Research Personnel/psychology , Surveys and Questionnaires
17.
Chimia (Aarau) ; 73(12): 1001-1005, 2019 Dec 18.
Article in English | MEDLINE | ID: mdl-31883551

ABSTRACT

Machine Learning and Data Science have enjoyed a renaissance due to the availability of increased computational power and larger data sets. Many questions that were previously beyond our scope can now be asked and answered. This does not translate instantly into new tools that can be used by those not skilled in the field, as many of the issues and traps still exist. In this paper, we look at some of the new tools that we have created, and some of the difficulties that still need to be taken care of during the transition from a project run by an expert to a tool for the bench chemist.

18.
Mol Inform ; 38(8-9): e1900031, 2019 08.
Article in English | MEDLINE | ID: mdl-31169974

ABSTRACT

The generated database GDB17 enumerates 166.4 billion possible molecules up to 17 atoms of C, N, O, S and halogens following simple chemical stability and synthetic feasibility rules; however, medicinal chemistry criteria are not taken into account. Here we applied rules inspired by medicinal chemistry to exclude problematic functional groups and complex molecules from GDB17, and sampled the resulting subset uniformly across molecular size, stereochemistry and polarity to form GDBMedChem as a compact collection of 10 million small molecules. This collection has reduced complexity and better synthetic accessibility than the entire GDB17 but retains a higher sp3-carbon fraction and natural product likeness scores compared to known drugs. GDBMedChem molecules are more diverse and very different from known molecules in terms of substructures and represent an unprecedented source of diversity for drug design. GDBMedChem is available for 3D-visualization, similarity searching and for download at http://gdb.unibe.ch.


Subjects
Databases, Pharmaceutical , Pharmaceutical Preparations/chemistry , Small Molecule Libraries/chemistry , Chemistry, Pharmaceutical , Drug Evaluation, Preclinical , Molecular Structure
19.
J Chem Inf Model ; 59(4): 1347-1356, 2019 04 22.
Article in English | MEDLINE | ID: mdl-30908913

ABSTRACT

Several recent reports have shown that long short-term memory generative neural networks (LSTM) of the type used for grammar learning efficiently learn to write Simplified Molecular Input Line Entry System (SMILES) of druglike compounds when trained with SMILES from a database of bioactive compounds such as ChEMBL and can later produce focused sets upon transfer learning with compounds of specific bioactivity profiles. Here we trained an LSTM using molecules taken either from ChEMBL, DrugBank, commercially available fragments, or from FDB-17 (a database of fragments up to 17 atoms) and performed transfer learning to a single known drug to obtain new analogs of this drug. We found that this approach readily generates hundreds of relevant and diverse new drug analogs and works best with training sets of around 40,000 compounds as simple as commercial fragments. These data suggest that fragment-based LSTM offer a promising method for new molecule generation.


Subjects
Cheminformatics/methods , Neural Networks, Computer , Pharmaceutical Preparations/chemistry , Models, Molecular , Molecular Conformation